Estimating the self‐thinning boundary line for oak mixed forests in central China by using stochastic frontier analysis and a proposed variable density model

Abstract A suitable self‐thinning model is fundamental to effective density control and management. Using data from 265 plot measurements in oak mixed forests in central China, we demonstrated how to estimate a suitable self‐thinning line for oak mixed forests from three aspects, i.e., self‐thinning models (Reineke's model and the variable density model), statistical methods (quantile regression and stochastic frontier analysis), and the variables affecting stands (topography and stand structure factors). The proposed variable density model, which is based on the quadratic mean diameter and dominant height, exhibited a better goodness of fit and biological relevance than Reineke's model for modeling the self‐thinning line for mixed oak forests. In addition, the normal‐truncated normal stochastic frontier model was superior to quantile regression for modeling the self‐thinning line. The altitude, Simpson index, and dominant height–diameter ratio (Hd/D) also had significant effects on the density of mixed forests. Overall, a variable density self‐thinning model may be constructed using stochastic frontier analysis for oak mixed forests while considering the effects of site quality and stand structure on density. The findings may contribute to a more accurate density management map for mixed forests.

affect overall forest quality and resources in oak distribution areas and impact regional ecological functions. Therefore, understanding the stand density structure, growth, and management of oak forests is particularly important and urgent.
Self-thinning in forests is caused by limited resources, including light and growing space (Pretzsch & Biber, 2005). The number of trees decreases when the available resources cannot meet the normal demands of growing trees (Vospernik & Sterba, 2015;Yang & Burkhart, 2017). Reineke's stand density index is based on a linear relationship between the logarithm of the stand density and the logarithm of the mean diameter, and the slope of the linear model is −1.605 (Reineke, 1933). The −3/2 rule outlined by Yoda et al. (1963) relates the mean plant volume (or biomass) to the number of plants per unit area. Other density indexes, such as those presented by Hart (1926) and West et al. (1997), have also received widespread attention. These rules are scientifically meaningful for forestry research. However, several studies have challenged these conclusions because the slope did not equal −1.605 or −3/2 in all studies (Gadow, 1986;Westoby & Howell, 1986;Zeide, 1985). Moreover, the maximum size-density relationship (MSDR) under different environmental conditions or tree species can differ greatly (Charru et al., 2012;Vospernik & Sterba, 2015). Despite remarkable advances, the self-thinning rule continues to be a controversial issue, and there are still many problems to solve. Currently, three aspects should be considered when estimating the self-thinning line of a tree species: a self-thinning model, statistical methods, and the variables that affect the MSDR.
In general, self-thinning models are built with stand density as the dependent variable and tree size (such as the mean diameter, mean height, or mean biomass) as the independent variable (Hart, 1926;Reineke, 1933;Yoda et al., 1963). After analyzing the maximum size density using reasoning and empirical evidence, Zeide (1985Zeide ( , 1987Zeide ( , 1991 and Burkhart (2013) concluded that measurements based on the diameter are preferable. The number of trees was most closely related to the average crown, but the height and tree volume were not as highly correlated with the crown width as the average diameter. Zeide (2005) further specified that an increase in tree height did not affect the stand density because trees grow side-by-side rather than on top of each other. The average height of a stand does F I G U R E 1 The status of oak mixed forest in central China F I G U R E 2 Three cases for the effect of tree height on stand density (H is the dominant height, N is the stand density) not change the amount of light available to the trees growing in a given area; it only lifts the crowns without securing additional light per unit area ( Figure 2a). However, the conclusions presented by Zeide (2005) were based on plantation forests, and it is not clear if the conclusions are applicable to mixed forests because the characteristics of diameter and height were significantly different from those of plantation forests. It is possible that the effects of height on the density in mixed forests are different from those in plantation forests in the following two aspects.
1. Tree height influences the size of the crown, and the crown influences the stand density.
When other factors (such as the diameter and wood quality) are equal, the correlation between the height and the crown dimensions is negative (Briegleb, 1952;Harding & Grigal, 1985;Ouellet, 1985).
This means that taller trees must have smaller crowns than shorter trees with the same diameter (Zeide, 2005). As shown in Figure 2b, when the diameter of tree i is kept constant, an increased height will lead to a decrease in the crown width. Then, the canopy overlaps and competition is reduced in the stand, which may lead to a decrease in the death rate. Moreover, the growth space occupied in the horizontal dimension will be reduced for each tree, and the number of trees per unit area may increase, leading to an increase in the stand density.
2. The dominant height influences the number of canopy levels, and the number of canopy levels influences the tree number.
Unlike a single canopy level in plantation forests, two or more canopy levels are often found in mixed forests. As shown in Figure 2c, only two canopy levels were found in the mixed forests when the dominant height was low, and the tree number with the maximum stand carrying capacity was 8. The number of canopy levels may increase when the dominant height increases, and then the growth space occupied in the vertical dimension will be enlarged, which leads to an increase in the stand density (e.g., the tree number increases to 10).
Applying different statistical methods leads to different conclusions regarding the validity of the self-thinning model. In previous studies, ordinary least squares (OLS) were frequently used for fitting the self-thinning model, and its validity relies on the data points reflecting the MSDR. Four methods for selecting data points have been proposed: the visual method (Osawa & Allen, 1993;Osawa & Sugita, 1989), quantifications of mortalities in successive measurements (Westoby, 1984), the selection of points from each equalwidth interval (Bi & Turvey, 1997;Newton, 2006) and the relative density method (Solomon & Zhang, 2002). However, such methods still harbor a degree of subjectivity and may lead to inaccurate results due to a lack of experience. Fortunately, with the advancement of modern statistical methods, the self-thinning line can be fitted using principal component analysis (Weller, 1987), stochastic frontier analysis (Bi et al., 2000;Charru et al., 2012), and linear quantile regression (Vospernik & Sterba, 2015;Zhang et al., 2013). Due to their favorable inferential properties, stochastic frontier analysis and quantile regression are widely considered to be the current state-of-the-art methods for estimating the self-thinning line (Andrews et al., 2018;Bi, 2001;Kimsey et al., 2019). The results of previous studies favored stochastic frontier analysis, although quantile regression performed nearly as well in some cases (Salas-Eljatib & Weiskittel, 2018). Accordingly, it is necessary to further explore and compare the two methods in terms of suitability and validity, especially for mixed forests of specific species.
Studies have shown that the MSDR influences tree species, site quality, nutrient availability, climate, and other factors, which is why the intercept and slope of the self-thinning line change (Comeau et al., 2010;Harper, 1977). In general, shade-tolerant species show a higher "stock ability" than shade-intolerant species (Charru et al., 2012;Pretzsch & Biber, 2005). Stand origin has been shown to affect the slope of the self-thinning line, with planted stands having a less steep slope (Charru et al., 2012;Weiskittel et al., 2009). The soil nutrient regime has a positive effect on the intercept and a negative effect on the slope of the self-thinning line (Reyes-Hernandez et al., 2013;Weiskittel et al., 2009). The climate has also been shown to have a significant influence on the maximum stand carrying capacity (Aguirre et al., 2018;Condés et al., 2017;de Prado et al., 2020).
Most studies have focused on the influence of habitat factors (e.g., site quality, climate, tree species diversity) on the MSDR, but less attention has been given to stand structure factors (e.g., ratio of height to diameter) Salas-Eljatib & Weiskittel, 2018).
The objective of this study was to explore how to estimate a suitable self-thinning line for mixed oak forests from three aspects: a self-thinning model, a statistical method, and the influential variables. Therefore, we aimed to: (1) propose a new self-thinning model and investigate the suitability of the proposed model for oak mixed forests through a comparison with Reineke's model; (2) evaluate the statistical approach in fitting self-thinning relationships in mixed forests; and (3) relate the stand maximum density to topography and stand structure factors.

| Study site and data
The study site is located in Hunan Province, central China, with longi- (3) the oak species accounted for more than 15% of all trees in the sample plot. Within each plot, the diameter at breast height (DBH) (measured using a diameter tape, minimum recording limit is 5 cm) and dominant height (arithmetic mean of seven dominant trees in the main canopy layer, measured using a Blume-Leiss hypsometer) were measured, and the tree species, trees per hectare, plot position (X and Y coordinates), elevation, slope, aspect, slope position, and soil were recorded (measured using a global positioning system and compass). The dataset was randomly split into two parts: 70% (185 plots) for modeling and 30% (80 plots) for validation. Information on the plots is shown in Table 1.

| Reineke's model (RM)
Reineke's stand density index was used in this study to fit the selfthinning line.
where N is the number of trees per hectare, D g is the quadratic mean diameter, 2 is the species-specific slope, 1 is the species-specific intercept, and ln is the natural logarithm. Yoda et al. (1963) proposed the relationship between the mean tree weight or the mean stem volume w and the stand density N in fully stocked pure stands during the self-thinning process as where k 1 and a is the species-specific parameter.

| Variable density model (VDM)
According to its formulation, w is a power function of the DBH and height: For Equation (2) and Equation (3), where k 1 , k 2 , a, b, and c are constants. Let k 4 = k 2 ∕k 1 1∕a , = b ∕ a, and = c ∕ a. Equation (5) can then be simplified to Equation (6).
Taking the natural logarithm of each side of Equation (6), Equation (7) can be viewed as a variable density model to fit the selfthinning line for mixed forests, where and are the species-specific slopes, k is the species-specific intercept, D is the quadratic mean diameter, and H d is the dominant height.

| Quantile regression
Compared with conventional statistical methods (such as OLS and PCA), quantile regression, which was introduced by Koenker and Basset (1978), is more flexible and provides parameter estimates for linear regressions fit to any portion (i.e., quantile) of a response distribution, including estimates near the observed bounds of the distributions, without imposing stringent assumptions on the distribution of the error (Koenker & Hallock, 2001). The parameter of the τth quantile model can be estimated by the asymmetric loss function that minimizes the absolute value of residuals, as shown in Equation (8) where y i is the vector of the dependent variables, x i is the vector of the independent variables, is a predetermined quantile between 0 and 1, and is the coefficient vector, which varies for different values (Scharf et al., 1998). In general, is 0.900, 0.950, or 0.990 for the self-thinning line in most studies. In this study, five values of (0.900, 0.925, 0.950, 0.975, and 0.990) were used in the quantile regression for estimating the self-thinning line.

| Stochastic frontier analysis
Stochastic frontier analysis has been proposed as an effective and powerful statistical technique that tests for the effect of covariates while allowing the assumption of constant error variance to be relaxed (heteroscedasticity) when all available data are used (Bi et al., 2000;Weiskittel et al., 2009). The original form of the equation is as follows: where Y i is the ith observation, X k is the observation of variables, A and k are the predictor coefficients, and e v i and e u i are model errors.
Taking the natural logarithm of each side of Equation (9)

| Comparative performances of the two models or methods
The performances of quantile regression and stochastic frontier analysis were compared using the maximum stand density index (SDI max ), that is, the maximum number of trees at a given reference average individual size (diameter or height) that can exist in a selfthinning population (Husch et al., 1972). In general, the larger the ratio of actual stand density to SDI max is, the more crowded the stand and the higher the tree mortality. For Reineke's model, the SDI' max is predicted as follows: For the variable density model, the SDI max is predicted as follows: where SDI max an SDI' max are the predicted maximum density at a baseaverage tree size, N is the actual stand density, D g is the actual stand diameter, H g is the realistic stand dominant height, D 0 is the standard stand diameter (the value is 16 cm), and H 0 is the standard stand dominant height (the value is 12 m).
The performances of Reineke's model and the variable density model were compared using the Akaike information criterion (AIC) and the likelihood ratio test (LRT).

| Topography and stand structure factors
One aim of this study was to explore the relationship between SDI max and topography and stand structure factors in mixed forests.
We fit several models of the form as follows: where X i is the predictor variables matrix, f() is a linear or nonlinear function, is a parameter vector of the model, and e i is the random error term that follows a Gaussian distribution with zero mean and where S is the number of tree species, P i is the proportion of the total sample belonging to the ith tree species, and ln is the natural log.
where n is the total number of specific tree species and N is the total number of all tree species. The information of the variables is shown in Table 2. We fitted the self-thinning models using stochastic frontier analysis. The model coefficients of Reineke's model and the variable density model were statistically significant (p < .05) (

| Evaluation of the statistical approach
We fitted the variable density model using quantile regression In addition, the relationship between tree mortality and SD/ SDI max (SD was the actual stand density) was assessed by using a linear function to evaluate the quantile regression and stochastic frontier analysis ( Figure 6). Obviously, the higher the value of SD/ SDI max is, the higher the tree mortality. The fitting accuracy of quantile regression did not show obvious regularity with the increase in quantile value. The t test of residuals showed that the R 2 of the model fitted by stochastic frontier analysis was significantly higher than that of the quantile regression function.
The validation samples were fitted using stochastic frontier analysis. We used a t test to evaluate the difference between the SDI max of the validation samples and modeling samples, which was calculated according to Equation (12). The results showed that there was no significant difference between the average values of the SDI max (p < .05). This demonstrated that the variable density model constructed with a stochastic frontier was reliable and stable. (13)

| Topography, stand structure, and density
Based on a variety of alternative models, SDI max was found to be effectively modeled as a function of altitude, Simpson index, and  Zeide (1985) attempted to modify Reineke's model and proposed a modified SDI equation: ln N = a − b ln D + xH ln D, but the results showed that the inclusion of height did not decrease the model error.

| Effect of the dominant height on density in mixed forests
In our assessment, the inclusion of the dominant height was able to improve the model error because the variable density model fitted by stochastic frontier analysis showed a better goodness of fit than Reineke's model (Table 3). The results demonstrate that the effect of height on the density in the mixed forests was different from that in plantation forests.
Interestingly, the independent variable DBH in Reineke's model and the variable density model was negatively correlated with density (Table 3), but the independent variable H d was positively correlated with density. This result indicates that the inclusion of the dominant height did not affect the self-thinning rate but did affect the maximum stand carrying capacity in the mixed forests.
When the dominant height increases, the number of canopy levels may increase, which may lead to an increase in the stand density ( Figure 2c). However, situation 1 shown in Figure 2b does not hold because the dominant height did not affect the self-thinning rate, and the competition did not decrease. model can predict the maximum stand density and self-thinning line in stands at various growth stages ( Figure 5). Therefore, we recommend fitting the self-thinning line using the variable density model in mixed forests because the measurement of the height does not increase the cost of applying remote sensing technology.
F I G U R E 5 Maximum density line reflecting the relationship between stand density and stand average diameter based on specific tree height (QR is the quantile regression, τ is the quantile value, and SFA is the stochastic frontier analysis).

F I G U R E 6
Tree mortality and SD/ SDI max showed a significant linear relationship in the statistical model estimated by quantile regression ( = 0.900, 0.925, 0.950, 0.975, 0.990) and stochastic frontier analysis.

| Suitable statistical methods to fit the selfthinning model
Although quantile regression and stochastic frontier analysis have become common and preferred methods to fit the self-thinning line, traditional methods remain important and indispensable. The selfthinning model estimated by traditional methods is based on stand growth dynamics and has biological significance, in contrast to quantile regression and stochastic frontier analysis.
Therefore, we also used traditional methods (visual method, mortality method, interval method, and relative density method) to fit the self-thinning model of mixed forest ( Figure 8). Westoby (1984) found that plantation forests began to self-thin when the mortality reached 20% because tree death was the result of a high density.
However, many tree deaths in mixed forests are the result of poor light and limited growth space, and the stand density is not the only reason for mortality. Therefore, the mortality method should be modified to make it suitable for mixed forests, which was confirmed in our study (Figure 8). Rational selection of the relative density standard is the key to modeling the self-thinning line (Solomon & Zhang, 2002), and the self-thinning line within the relative density standard above 1.0 was close to the upper edge of the stand density and showed a suitable goodness of fit in this study (Figure 8). We found that the interval method was able to avoid the disadvantages of the visual method, mortality method, and relative density method ( Figure 8). However, it is possible to include some plots at lower densities that have not reached the stage of self-thinning (Zhang et al., 2005). Consequently, the slope coefficient of the self-thinning line based on this subset of the plots may be flatter than expected (Osawa & Allen, 1993;Westoby, 1984). Overall, the interval method was the most suitable for modeling the self-thinning line among the four traditional methods.
Although quantile regression and stochastic frontier analysis can serve the same purpose of boundary delineation in applied statistics far beyond econometrics, an adequate appreciation of the differences between the two approaches is essential for researchers to select the best method to estimate the self-thinning surface (Tian et al., 2020). The value of chosen for estimating the self-thinning line mainly ranges from 0.90 to 0.99 in the literature, with 0.95 ≤ ≤ 0.99 being the most common choice (Andrews et al., 2018;Condés et al., 2017;Vospernik & Sterba, 2015).
However, without a careful comparison to strike a balance between the quantiles, quantile regression is likely to introduce a certain degree of subjectivity. Compared with the quantile regression, we found that the stochastic frontier analysis was superior in model prediction ( Figure 6). Furthermore, stochastic frontier analysis may lead to a more objective self-thinning line because it does not involve the subjective selection of a particular value of . However, quantile regression can still serve as a valuable complement to stochastic frontier analysis in the estimation of the self-thinning surface, as it allows the impact of variables other than stand density on different quantiles to be examined (Tian et al., 2020).

| Effect of topography and structure on density in mixed forests
The significant correlations between the stand density and the site quality and climate have been confirmed in the literature (Long & Shaw, 2005;Zhang et al., 2013). Site quality has a significant influence on the intercept of the self-thinning line; the better the site quality is, the higher the carrying capacity (Kimsey et al., 2019;Weiskittel et al., 2009). Generally, the stand density should be higher

F I G U R E 7
The relationship between SDI max and H d /D in fixed values of the altitude and Simpson index. The altitude at 0-500 m was set to low (a), and the altitude at 501-1000 m was set to high (b). Simpson index levels were set to low (0-0.5) and high (0.5-1.0).
in areas with low altitudes and slopes due to the negative correlation between the site quality and the altitude and slope. In contrast, a higher density was found at higher altitudes and slopes in this study (2) there is a high population density in the areas with low altitudes and slopes, and frequent anthropogenic activities may result in a decrease in the stand density.
The stand structure (e.g., tree species diversity) has significant effects on the stand density and tree competition (Weiskittel et al., 2009). Unfortunately, significant effects of tree species diversity on stand density were not found in this study ( Figure 7), but the higher the distribution uniformity of tree species was, the higher the stand density. Generally, in a stand, a higher distribution uniformity of tree species will lead to a higher overlap and a lower competition and mortality, which is one of the reasons for the increase in stand density.

ACK N OWLED G M ENT
We thank the Central South Forest Inventory and Planning Institute of State Forestry Administration for providing the data used in this analysis. We also thank AJE (www.aje.cn) for performing English language editing. This project was supported by the National Forestry Public Welfare Research Project of China (No. 201504301).

CO N FLI C T O F I NTE R E S T
The authors declare that they have no known competing financial interests or personal relationships.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data presented in this study have been uploaded to Dryad (https://doi.org/10.5061/dryad.931zc rjnq).